TensorFlow DNN Classifier

In this article, we solve a classification problem in TensorFlow using Estimators, with the UCI ML Wine recognition dataset as the example. The dataset is also available through the scikit-learn datasets module.
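As a minimal sketch of the scikit-learn route mentioned above, the dataset can be loaded with `load_wine`. The variable names `X` and `y` are our own choice, and `as_frame=True` assumes scikit-learn 0.23 or later.

```python
from sklearn.datasets import load_wine

# as_frame=True returns pandas objects instead of NumPy arrays.
wine = load_wine(as_frame=True)
X = wine.data    # DataFrame: one row per wine, one column per constituent
y = wine.target  # Series of cultivar labels encoded as 0, 1, 2

print(X.shape)
print(y.value_counts().sort_index())
```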

Dataset Information:

These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.

Data Correlations

Let's take a look at the variance of the features.
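One way to inspect the variances, assuming the data is loaded as a pandas DataFrame through scikit-learn, is sketched below. The constituents are measured on very different scales, so the variances differ by several orders of magnitude.

```python
from sklearn.datasets import load_wine

X = load_wine(as_frame=True).data

# Per-feature sample variance, sorted from largest to smallest.
variances = X.var().sort_values(ascending=False)
print(variances)
```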

Next, we standardize the features by removing the mean and scaling to unit variance, so that features with large variance do not dominate the others.
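Standardization replaces each value x with (x - mean) / standard deviation, computed per feature. A minimal sketch using scikit-learn's `StandardScaler` follows; the name `X_std` is our own, and the article may have used a different tool for this step.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler

X = load_wine(as_frame=True).data

# fit_transform learns each column's mean and standard deviation,
# then applies (x - mean) / std to every value in that column.
scaler = StandardScaler()
X_std = pd.DataFrame(scaler.fit_transform(X), columns=X.columns)

print(X_std.mean().round(6))
print(X_std.std(ddof=0).round(6))
```

In a stricter workflow, the scaler would be fitted on the training split only and then applied to the test split, so that no test-set statistics leak into training.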

Train and Test sets
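A possible split is sketched below with scikit-learn's `train_test_split`. The 70/30 ratio, the random seed, and stratifying on the label are assumptions made for illustration, not values taken from the article.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

wine = load_wine(as_frame=True)
X, y = wine.data, wine.target

# stratify=y keeps the three cultivars in similar proportions in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
```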

Input Function

The input function specifies how data is converted to a tf.data.Dataset that feeds the input pipeline in a streaming fashion. It returns a tf.data.Dataset object that outputs a two-element tuple: features, a dictionary in which each key is a feature name and each value is an array of that feature's values, and label, an array containing the label of every example.
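An input function of this kind might look like the sketch below. The factory name `make_input_fn` and its parameters (`training`, `batch_size`) are hypothetical, chosen for illustration; the features are assumed to be a pandas DataFrame and the labels a pandas Series.

```python
import pandas as pd
import tensorflow as tf


def make_input_fn(features, labels, training=True, batch_size=32):
    """Return an input function that yields (features dict, labels) batches."""

    def input_fn():
        # dict(features) maps each column name to that column's values.
        dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
        if training:
            # Shuffle and repeat so training can run for any number of steps.
            dataset = dataset.shuffle(buffer_size=len(features)).repeat()
        return dataset.batch(batch_size)

    return input_fn


# Tiny made-up frame, only to show the structure of one batch.
demo_features = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [0.5, 0.6, 0.7, 0.8]})
demo_labels = pd.Series([0, 1, 2, 1])
demo_input_fn = make_input_fn(demo_features, demo_labels, training=False, batch_size=2)
batch_features, batch_labels = next(iter(demo_input_fn()))
print(batch_features, batch_labels)
```

The estimator calls the returned function itself, which is why a factory is used here instead of passing the dataset directly.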

An estimator model works with two things: feature columns and a numeric input vector. Feature columns describe how each entry of the input numeric vector should be interpreted. A helper function can separate categorical and numerical columns (features) and return a descriptive list of feature columns.
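A helper of this kind could be sketched as follows. The name `build_feature_columns` is hypothetical, and the sketch assumes a pandas DataFrame as input and a TensorFlow 2.x release in which the (deprecated) `tf.feature_column` module is still available. All 13 wine features are numeric, so the categorical branch is shown only for completeness.

```python
import pandas as pd
import tensorflow as tf


def build_feature_columns(df):
    """Return one feature column per DataFrame column."""
    columns = []
    for name in df.columns:
        if pd.api.types.is_numeric_dtype(df[name]):
            columns.append(tf.feature_column.numeric_column(key=name))
        else:
            # One-hot encode a categorical feature from its observed values.
            vocabulary = sorted(str(v) for v in df[name].dropna().unique())
            categorical = tf.feature_column.categorical_column_with_vocabulary_list(
                key=name, vocabulary_list=vocabulary
            )
            columns.append(tf.feature_column.indicator_column(categorical))
    return columns


# Made-up frame with one numeric and one categorical column.
demo = pd.DataFrame({"alcohol": [13.2, 12.4], "region": ["a", "b"]})
feature_columns = build_feature_columns(demo)
print(feature_columns)
```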

Estimator using the default optimizer
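The sketch below shows one way to build such an estimator with `tf.estimator.DNNClassifier`. It assumes TensorFlow 2.15 or earlier, since the `tf.estimator` API was removed in TensorFlow 2.16. The function name `build_classifier` and the layer sizes are hypothetical, not the article's settings.

```python
import tensorflow as tf


def build_classifier(feature_names, hidden_units=(32, 16), n_classes=3):
    """Build a DNNClassifier over numeric features (TensorFlow <= 2.15)."""
    feature_columns = [
        tf.feature_column.numeric_column(key=name) for name in feature_names
    ]
    # No optimizer is passed, so the estimator uses its default (Adagrad).
    return tf.estimator.DNNClassifier(
        feature_columns=feature_columns,
        hidden_units=list(hidden_units),
        n_classes=n_classes,
    )
```

With an input function for each split, training and evaluation would then be calls such as `classifier.train(input_fn=train_input_fn, steps=1000)` and `classifier.evaluate(input_fn=test_input_fn)`, where both input function names and the step count are placeholders.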

Predictions
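The canned classifiers return predictions as a generator of dictionaries, one per example, with keys such as `probabilities` and `class_ids`. The helper below, whose name `collect_predictions` is our own, is a sketch of how those dictionaries can be stacked into arrays. The sample dictionaries are made up to mimic that structure.

```python
import numpy as np


def collect_predictions(predictions):
    """Stack per-example prediction dicts into probability and class arrays."""
    probabilities, class_ids = [], []
    for pred in predictions:
        probabilities.append(np.asarray(pred["probabilities"]))
        # class_ids is a length-1 array holding the most likely class.
        class_ids.append(int(np.asarray(pred["class_ids"])[0]))
    return np.vstack(probabilities), np.array(class_ids)


# Made-up prediction dicts, for illustration only.
sample = [
    {"probabilities": np.array([0.7, 0.2, 0.1]), "class_ids": np.array([0])},
    {"probabilities": np.array([0.1, 0.3, 0.6]), "class_ids": np.array([2])},
]
y_proba, y_pred = collect_predictions(sample)
print(y_proba.shape, y_pred)
```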

ROC Curves
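For a three-class problem, ROC curves are commonly drawn one-vs-rest: each class in turn is treated as the positive class. The sketch below follows that approach with scikit-learn, which may differ from the article's own plotting code. The labels and probabilities are hypothetical values, not model output.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize


def one_vs_rest_roc(y_true, y_proba, classes=(0, 1, 2)):
    """Return {class: (fpr, tpr, auc)} with each class as the positive one."""
    y_bin = label_binarize(y_true, classes=list(classes))
    curves = {}
    for i, cls in enumerate(classes):
        fpr, tpr, _ = roc_curve(y_bin[:, i], y_proba[:, i])
        curves[cls] = (fpr, tpr, auc(fpr, tpr))
    return curves


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_proba = np.array([
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
])
curves = one_vs_rest_roc(y_true, y_proba)
for cls, (fpr, tpr, area) in curves.items():
    print(f"class {cls}: AUC = {area:.2f}")
```

Each `(fpr, tpr)` pair can be passed to a plotting call such as matplotlib's `plt.plot(fpr, tpr)` to draw the curve for that class.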

Confusion Matrix
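A confusion matrix can be computed from true and predicted class ids with scikit-learn, as in the sketch below. The two arrays are hypothetical and serve only to show the layout of the result.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted class ids, for illustration only.
y_true = np.array([0, 0, 1, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 2])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(cm)

# Row-normalized version: each row sums to 1, and the diagonal
# holds the per-class recall.
cm_normalized = cm / cm.sum(axis=1, keepdims=True)
print(cm_normalized.round(2))
```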

Estimator using an optimizer with a learning rate decay

With this estimator, the optimizer's learning rate changes over time: it starts at an initial value and decays as training progresses.
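The article does not say which decay schedule it uses. As an assumption, the sketch below shows exponential decay, a common choice, in plain Python. The function name and the parameter values are hypothetical.

```python
def exponential_decay(initial_lr, step, decay_steps, decay_rate, staircase=False):
    """Learning rate after `step` training steps under exponential decay."""
    exponent = step / decay_steps
    if staircase:
        # Decay in discrete jumps, once every `decay_steps` steps.
        exponent = step // decay_steps
    return initial_lr * decay_rate ** exponent


for step in (0, 10_000, 20_000):
    rate = exponential_decay(0.1, step, decay_steps=10_000, decay_rate=0.96)
    print(step, rate)
```

TensorFlow's `tf.keras.optimizers.schedules.ExponentialDecay` follows the same formula, so a schedule like this can be handed to an optimizer instead of a fixed learning rate.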

Predictions

ROC Curves

Confusion Matrix

